Python Data Analytics by 2023
Author:2023
Language: eng
Format: epub
Chapter 6 - pandas in Depth: Data Manipulation
In addition, after an aggregation, the resulting column names may not be very meaningful.
It is often useful to add a prefix to each column name that describes the type of aggregation applied.
Adding a prefix, rather than replacing the name entirely, keeps track of the source columns from which the aggregate values derive. This matters when you apply a chain of transformations (one series or dataframe generated from another), because each result keeps a reference to its source data.
>>> means = frame.groupby('color').mean(numeric_only=True).add_prefix('mean_')
>>> means
       mean_price1  mean_price2
color
green        2.025        2.375
red          2.380        2.435
white        5.560        4.750
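The following is a minimal, self-contained sketch of why the prefix helps in a transformation chain. The contents of frame are an assumption, chosen only so that the group means match the output shown above; the name enriched is hypothetical.

```python
import pandas as pd

# Assumed sample data, chosen to be consistent with the means shown above.
frame = pd.DataFrame({
    "color": ["white", "red", "green", "red", "green"],
    "object": ["pen", "pencil", "pencil", "ashtray", "pen"],
    "price1": [5.56, 4.20, 1.30, 0.56, 2.75],
    "price2": [4.75, 4.12, 1.60, 0.75, 3.15],
})

# Aggregate the numeric columns per color and mark them as means.
means = frame.groupby("color").mean(numeric_only=True).add_prefix("mean_")

# Because of the prefix, the aggregates can sit next to the source
# columns without any name clash when joined back to the original rows.
enriched = frame.merge(means, left_on="color", right_index=True)
print(enriched)
```

Without the prefix, the merge would have to disambiguate two price1 and two price2 columns with automatic suffixes, which says nothing about how the values were computed.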
Functions on Groups
Although many methods were not implemented specifically for GroupBy, they work correctly on the series extracted from a GroupBy object. You saw in the previous section how easy it is to get a series from a GroupBy object: specify the column name and then apply the method that performs the calculation. For example, you can compute quantiles with the quantile() function.
>>> group = frame.groupby('color')
>>> group['price1'].quantile(0.6)
color
green 2.170
red 2.744
white 5.560
Name: price1, dtype: float64
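As a further illustration, quantile() also accepts a list of probabilities, in which case the result is indexed by both group and quantile. This is only a sketch; the contents of frame are an assumption, chosen to be consistent with the output shown above.

```python
import pandas as pd

# Assumed sample data, consistent with the quantile output shown above.
frame = pd.DataFrame({
    "color": ["white", "red", "green", "red", "green"],
    "price1": [5.56, 4.20, 1.30, 0.56, 2.75],
})

group = frame.groupby("color")

# A single probability gives one value per group.
q60 = group["price1"].quantile(0.6)

# A list of probabilities gives a series with a two-level index:
# (color, probability).
quartiles = group["price1"].quantile([0.25, 0.75])
print(quartiles)
```

With the default linear interpolation, a group holding two values a and b (with a < b) has its q-quantile at a + q * (b - a), which is how green arrives at 2.170 for q = 0.6.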
You can also define your own aggregation functions. Define the function separately and then pass it as an argument to the agg() function. For example, you can calculate the range of the values in each group. (Note that naming the function range shadows Python's built-in range() for the rest of the session.)
>>> def range(series):
...     return series.max() - series.min()
...
>>> group['price1'].agg(range)
color
green 1.45
red 3.64
white 0.00
Name: price1, dtype: float64
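A variant of the same idea, sketched as a script rather than an interactive session. The function name value_range is hypothetical, picked to avoid shadowing the built-in range(); the contents of frame are an assumption consistent with the output shown above.

```python
import pandas as pd

# Assumed sample data, consistent with the ranges shown above.
frame = pd.DataFrame({
    "color": ["white", "red", "green", "red", "green"],
    "price1": [5.56, 4.20, 1.30, 0.56, 2.75],
})

group = frame.groupby("color")


def value_range(series):
    """Return the spread (max minus min) of one group's values."""
    return series.max() - series.min()


# agg() calls the function once per group, passing that group's series.
ranges = group["price1"].agg(value_range)

# For a one-off calculation, a lambda does the same job.
ranges_lambda = group["price1"].agg(lambda s: s.max() - s.min())
print(ranges)
```

A named function is usually the better choice when the result ends up as a column label, since the function name becomes the label.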
You can also apply several aggregate functions at once by passing agg() a list of the operations to perform; each operation becomes a column in the result.
>>> group['price1'].agg(['mean','std',range])
        mean       std  range
color
green  2.025  1.025305   1.45
red    2.380  2.573869   3.64
white  5.560       NaN   0.00
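If you want to control the names of the resulting columns, pandas also supports keyword arguments to agg() (named aggregation). The sketch below assumes sample data consistent with the output shown above; value_range and the keyword names are hypothetical.

```python
import pandas as pd

# Assumed sample data, consistent with the table shown above.
frame = pd.DataFrame({
    "color": ["white", "red", "green", "red", "green"],
    "price1": [5.56, 4.20, 1.30, 0.56, 2.75],
})

group = frame.groupby("color")


def value_range(series):
    """Return the spread (max minus min) of one group's values."""
    return series.max() - series.min()


# Each keyword becomes a column name; each value is the operation,
# given either as a string name or as a callable.
summary = group["price1"].agg(
    mean="mean",
    std="std",
    value_range=value_range,
)
print(summary)
```

The std entry for white is NaN because the sample standard deviation is undefined for a group that contains a single value.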
